Recognition and normalization of disease mentions in PubMed abstracts
Abstract
The rapidly increasing number of available PubMed documents calls for an automatic approach to identifying and normalizing disease mentions in order to increase the precision and effectiveness of information retrieval. We herein describe our team’s participation in the Disease Named Entity Recognition and Normalization subtask of the chemical-disease relations track of the BioCreative V shared task. We developed a CRF-based model using the BIESO tagging format to automatically recognize disease entities in PubMed abstracts. Recognized disease entities were normalized to MeSH concepts using a dictionary look-up method based on Lucene. Performance is reported using precision, recall and F-measure on three separate runs. Our best run achieved an F-measure of 80.74% on disease mention recognition and 67.85% on disease normalization.
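As an illustration of the BIESO tagging format mentioned in the abstract, the following minimal sketch converts token-level entity spans into per-token labels (B = begin, I = inside, E = end, S = single-token entity, O = outside). The function name `bieso_tags`, the example sentence, and the span indices are hypothetical and are not taken from the paper; the CRF features and training procedure are not shown.

```python
from typing import List, Tuple


def bieso_tags(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Return one BIESO tag per token.

    `spans` holds (start, end) token indices with `end` exclusive;
    spans are assumed not to overlap (an assumption of this sketch).
    """
    tags = ["O"] * len(tokens)
    for start, end in spans:
        if end - start == 1:
            # A one-token entity gets the single tag.
            tags[start] = "S"
        else:
            tags[start] = "B"
            for i in range(start + 1, end - 1):
                tags[i] = "I"
            tags[end - 1] = "E"
    return tags


# Hypothetical example sentence with two disease mentions.
tokens = ["Patients", "with", "type", "2", "diabetes", "developed", "nephropathy", "."]
spans = [(2, 5), (6, 7)]  # "type 2 diabetes" and "nephropathy"

for token, tag in zip(tokens, bieso_tags(tokens, spans)):
    print(f"{token}\t{tag}")
```

A sequence labeler such as a CRF would be trained to predict these tags per token, and contiguous B/I/E or S runs would then be read back as disease mentions.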
Similar resources
Disease Named Entity Recognition and Normalization using Conditional Random Fields and Levenshtein Distance
This paper presents a machine learning-based approach for the disease named entity recognition and normalization (DNER) subtask of the Chemical Disease Relation (CDR) task in BioCreative V. This approach employs a Conditional Random Fields (CRF) based model with domain-specific features from the biomedical area for disease named entity recognition. In order to improve the performance of entity normalization, the me...
NCBI disease corpus: A resource for disease name recognition and concept normalization
Information encoded in natural language in biomedical literature publications is only useful if efficient and reliable ways of accessing and analyzing that information are available. Natural language processing and text mining tools are therefore essential for extracting valuable information; however, the development of powerful, highly effective tools to automatically detect central biomedical...
DNorm: disease name normalization with pairwise learning to rank
MOTIVATION: Despite the central role of diseases in biomedical research, there have been far fewer attempts to automatically determine which diseases are mentioned in a text, the task of disease name normalization (DNorm), compared with other normalization tasks in biomedical text mining research. METHODS: In this article we introduce the first machine learning approach for DNorm, using the NCBI...
Human Gene Name Normalization using Text Matching with Automatically Extracted Synonym Dictionaries
The identification of genes in biomedical text typically consists of two stages: identifying gene mentions and normalization of gene names. We have created an automated process that takes the output of named entity recognition (NER) systems designed to identify genes and normalizes them to standard referents. The system identifies human gene synonyms from online databases to generate an extensi...
Improving the dictionary lookup approach for disease normalization using enhanced dictionary and query expansion
The rapidly increasing biomedical literature calls for an automatic approach to the recognition and normalization of disease mentions in order to increase the precision and effectiveness of disease-based information retrieval. A variety of methods have been proposed to deal with the problem of disease named entity recognition and normalization. Among all the proposed methods, conditio...
Publication date: 2015